Generating multi-genre, long-term choreography sequences from given music is a challenging task, largely because no multi-genre dataset exists. To tackle this problem, we propose a Multi Art Genre Intelligent Choreography Dataset (MagicDance). The data in MagicDance is captured from professional dancers assisted by motion capture technicians. It contains a total of 8 hours of 3D motion-capture human dances with paired music, covering 16 different dance genres. To the best of our knowledge, MagicDance is the 3D dance dataset with the most genres. In addition, we find that the two existing types of methods (generation-based and synthesis-based) can each satisfy only one of diversity and duration, but that they complement each other to some extent. Based on this observation, we also propose a generation-synthesis choreography network (MagicNet), which cascades a Diffusion-based 3D Diverse Dance fragments Generation Network (3DGNet) and a Genre&Coherent aware Retrieval Module (GCRM). The former can generate various dance fragments from a single music clip. The latter selects the best dance fragments generated by 3DGNet and assembles them into a complete dance according to the genre and coherence matching scores. Quantitative and qualitative experiments demonstrate the quality of MagicDance and the state-of-the-art performance of MagicNet.
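Poles and building edges are frequently observable objects on urban roads and provide reliable cues for various computer vision tasks. To extract them repeatably as features and associate them between discrete LiDAR frames for registration, we propose the first learning-based feature segmentation and description model for 3D lines in LiDAR point clouds. To train our model without a time-consuming and tedious data labeling process, we first generate synthetic primitives for the basic appearance of target lines, and then build an iterative line auto-labeling process that gradually refines the line labels on real LiDAR scans. Our segmentation model can extract lines under arbitrary scale perturbations, and we use shared EdgeConv encoder layers to train the segmentation and descriptor heads jointly. Based on the model, we can build a highly available global registration module for point cloud registration that works without initial transformation hints. Experiments show that our line-based registration method is highly competitive with state-of-the-art approaches. Our code is available at https://github.com/zxrzju/superline3d.git.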
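We propose a precise and efficient normal estimation method that can handle noise and non-uniform density in unstructured 3D point clouds. Existing approaches take patches directly and ignore local neighborhood relationships, which makes them susceptible to challenging regions such as sharp edges. In contrast, we propose to learn a graph convolutional feature representation for normal estimation, which places more emphasis on local neighborhood geometry and effectively encodes intrinsic relationships. In addition, we design a novel adaptive module based on the attention mechanism to integrate point features with their neighboring features, further enhancing the robustness of the proposed normal estimator against variations in point density. To make the representation more discriminative, we introduce a multi-scale architecture in the graph block to learn richer geometric features. Our method outperforms competitors with state-of-the-art accuracy on various benchmark datasets, and is highly robust to noise, outliers, and density variations.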
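Recently, generalization on out-of-distribution (OOD) data with correlation shift has attracted great attention. Correlation shift is caused by spurious attributes that correlate with the class label, since the correlation between them may differ between training and test data. For this problem, we show that models that are conditionally independent of the spurious attributes, given the class label, are OOD generalizable. Based on this, we propose a metric, Conditional Spurious Variation (CSV), which controls the OOD generalization error and measures this conditional independence. To improve OOD generalization, we regularize the training process with the proposed CSV. Under mild assumptions, our training objective can be formulated as a nonconvex-concave mini-max problem. An algorithm with a provable convergence rate is proposed to solve the problem. Extensive empirical results verify the efficacy of our algorithm in improving OOD generalization.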
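Existing permutation-invariant methods can be divided into two categories according to their aggregation scope: global aggregation and local aggregation. Although global aggregation methods, e.g., PointNet and Deep Sets, involve simpler structures, their performance is poorer than that of local aggregation methods such as PointNet++ and Point Transformer. Whether there exists a global aggregation method with a simple structure, competitive performance, and even fewer parameters remains an open problem. In this paper, we propose a novel global aggregation permutation-invariant network based on a dual MLP dot-product, called DuMLP-Pin, which can be used to extract features for set inputs, including unordered or unstructured pixel, attribute, and point cloud data sets. We rigorously prove that any permutation-invariant function implemented by DuMLP-Pin can be decomposed into two or more permutation-equivariant functions in a dot-product manner, provided that the cardinality of the given input set is greater than a threshold. We also show that, under certain conditions, DuMLP-Pin can be viewed as Deep Sets with strong constraints. The performance of DuMLP-Pin is evaluated on several different tasks with diverse data sets. The experimental results show that DuMLP-Pin achieves the best results on two classification problems for pixel sets and attribute sets. On both point cloud classification and part segmentation, the accuracy of DuMLP-Pin is very close to that of the best-performing local aggregation method so far, with only a 1-2% difference, while the number of required parameters is reduced by more than 85% in classification and 69% in segmentation, respectively. The code is publicly available at https://github.com/jaronthu/dumlp-pin.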
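The dual MLP dot-product idea can be illustrated with a minimal sketch. It assumes the aggregation takes the form F^T G, where the rows of F and G are the outputs of two separate MLPs applied to each set element; the helper names (`make_mlp`, `apply_mlp`, `dual_mlp_aggregate`), the layer sizes, and the random untrained weights are all hypothetical and are not taken from the DuMLP-Pin code base.

```python
import random


def make_mlp(in_dim, hidden_dim, out_dim, rng):
    """Random weights for a bias-free, one-hidden-layer MLP (illustrative only)."""
    w1 = [[rng.gauss(0, 1) for _ in range(hidden_dim)] for _ in range(in_dim)]
    w2 = [[rng.gauss(0, 1) for _ in range(out_dim)] for _ in range(hidden_dim)]
    return w1, w2


def apply_mlp(weights, x):
    """Apply the MLP to a single set element x (a list of floats)."""
    w1, w2 = weights
    hidden = [
        max(0.0, sum(x[i] * w1[i][j] for i in range(len(x))))  # ReLU
        for j in range(len(w1[0]))
    ]
    return [
        sum(hidden[j] * w2[j][k] for j in range(len(hidden)))
        for k in range(len(w2[0]))
    ]


def dual_mlp_aggregate(points, mlp_f, mlp_g):
    """Compute F^T G, where row n of F is f(points[n]) and row n of G is g(points[n]).

    Each entry is a sum over set elements, so reordering the elements
    changes the result only by floating-point rounding.
    """
    f_rows = [apply_mlp(mlp_f, p) for p in points]
    g_rows = [apply_mlp(mlp_g, p) for p in points]
    a, b = len(f_rows[0]), len(g_rows[0])
    return [
        [sum(f[i] * g[j] for f, g in zip(f_rows, g_rows)) for j in range(b)]
        for i in range(a)
    ]


rng = random.Random(0)
mlp_f = make_mlp(3, 8, 4, rng)  # hypothetical sizes: 3-D points, 4 outputs
mlp_g = make_mlp(3, 8, 3, rng)  # second MLP with 3 outputs
points = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(10)]
out = dual_mlp_aggregate(points, mlp_f, mlp_g)  # fixed-size 4 x 3 set feature
```

The output size depends only on the two MLP output widths, not on the number of set elements, which is what makes this a global aggregation.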
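This work proposes a new computational framework for learning an explicit generative model for real-world datasets. In particular, we propose to learn a closed-loop transcription between a multi-class, multi-dimensional data distribution and a linear discriminative representation (LDR) in a feature space that consists of multiple independent multi-dimensional linear subspaces. We argue that the optimal encoding and decoding mappings sought can be formulated as the equilibrium point of a two-player minimax game between the encoder and the decoder. A natural utility function for this game is the so-called rate reduction, a simple information-theoretic measure of the distance between mixtures of subspace-like Gaussians in the feature space. Our formulation draws inspiration from closed-loop error feedback in control systems and avoids the expensive evaluation and minimization of approximate distances between arbitrary distributions in either the data space or the feature space. To a large extent, this new formulation unifies the concepts and benefits of auto-encoding and GAN, and naturally extends them to the setting of learning a representation that is both discriminative and generative for multi-class, multi-dimensional real-world data. Our extensive experiments on many benchmark image datasets demonstrate the great potential of this new closed-loop formulation: under fair comparison, the visual quality of the learned decoder and the classification performance of the encoder are competitive with, and often better than, existing methods based on GAN, VAE, or a combination of both. We notice that the features learned for different classes are explicitly mapped onto approximately independent principal subspaces in the feature space, and that diverse visual attributes within each class are modeled by the independent principal components within each subspace.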
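For reference, the rate reduction measure mentioned above is commonly written as follows. This is a sketch based on the usual coding-rate definition from the rate-reduction literature, not a formula quoted from this abstract; the symbols Z (a d x n feature matrix), \hat{Z} (a second feature set of the same shape), and \epsilon (the coding precision) are assumptions of this note.

```latex
% Coding rate of n features Z in R^{d x n} at precision epsilon
R(Z) = \frac{1}{2} \log\det\!\left( I + \frac{d}{n\epsilon^{2}} Z Z^{\top} \right)

% Rate reduction used as a distance between two feature sets Z and \hat{Z}
\Delta R(Z, \hat{Z}) = R(Z \cup \hat{Z}) - \frac{1}{2}\left( R(Z) + R(\hat{Z}) \right)
```

Under this reading, the distance is zero when the two feature sets span the same subspace-like Gaussian and grows as they separate, which is why it can serve as the utility of a game between encoder and decoder.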
In this paper, we study the problem of knowledge-intensive text-to-SQL, in which domain knowledge is necessary to parse expert questions into SQL queries over domain-specific tables. We formalize this scenario by building a new Chinese benchmark KnowSQL consisting of domain-specific questions covering various domains. We then address this problem by presenting formulaic knowledge, rather than by annotating additional data examples. More concretely, we construct a formulaic knowledge bank as a domain knowledge base and propose a framework (ReGrouP) to leverage this formulaic knowledge during parsing. Experiments using ReGrouP demonstrate a significant 28.2% improvement overall on KnowSQL.
Knowledge graph embedding (KGE), which maps entities and relations in a knowledge graph into continuous vector spaces, has achieved great success in predicting missing links in knowledge graphs. However, knowledge graphs often contain incomplete triples that are difficult for KGEs to infer inductively. To address this challenge, we resort to analogical inference and propose a novel and general self-supervised framework, AnKGE, to enhance KGE models with analogical inference capability. We propose an analogical object retriever that retrieves appropriate analogical objects at the entity, relation, and triple levels. In AnKGE, we train an analogy function for each level of analogical inference; it takes the original element embedding from a well-trained KGE model as input and outputs the analogical object embedding. To combine the inductive inference capability of the original KGE model with the analogical inference capability added by AnKGE, we interpolate the analogy score with the base model score and introduce adaptive weights in the score function for prediction. Through extensive experiments on the FB15k-237 and WN18RR datasets, we show that AnKGE achieves competitive results on the link prediction task and performs analogical inference well.
Text-to-SQL semantic parsing is an important NLP task that greatly facilitates the interaction between users and databases and has become a key component of many human-computer interaction systems. Much recent progress in text-to-SQL has been driven by large-scale datasets, but most of them are centered on English. In this work, we present MultiSpider, the largest multilingual text-to-SQL dataset, which covers seven languages (English, German, French, Spanish, Japanese, Chinese, and Vietnamese). Using MultiSpider, we further identify the lexical and structural challenges of text-to-SQL (caused by specific language properties and dialect sayings) and their intensity across different languages. Experimental results under three typical settings (zero-shot, monolingual, and multilingual) reveal a 6.1% absolute drop in accuracy in non-English languages. Qualitative and quantitative analyses are conducted to understand the reasons for the performance drop in each language. Besides the dataset, we also propose a simple schema augmentation framework, SAVe (Schema-Augmentation-with-Verification), which significantly boosts the overall performance by about 1.8% and closes the 29.5% performance gap across languages.
In this paper, we present a pure-Python open-source library, called PyPop7, for black-box optimization (BBO). It provides a unified and modular interface for more than 60 versions and variants of different black-box optimization algorithms, particularly population-based optimizers, which can be classified into 12 popular families: Evolution Strategies (ES), Natural Evolution Strategies (NES), Estimation of Distribution Algorithms (EDA), Cross-Entropy Method (CEM), Differential Evolution (DE), Particle Swarm Optimizer (PSO), Cooperative Coevolution (CC), Simulated Annealing (SA), Genetic Algorithms (GA), Evolutionary Programming (EP), Pattern Search (PS), and Random Search (RS). It also provides many examples, tutorials, and full-fledged API documentation. Through this new library, we expect to provide a well-designed platform for benchmarking optimizers and to promote their real-world applications, especially for large-scale BBO. Its source code and documentation are available at https://github.com/Evolutionary-Intelligence/pypop and https://pypop.readthedocs.io/en/latest, respectively.
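To make the black-box setting concrete, here is a minimal sketch of the simplest family listed above, Random Search (RS), written in plain Python. It does not use or reproduce the PyPop7 API; the function `random_search`, its parameters, and the `sphere` test function are hypothetical names chosen for illustration. The optimizer sees the objective only through function evaluations, which is the defining constraint of BBO.

```python
import random


def random_search(fitness, lower, upper, max_evaluations=2000, seed=0):
    """Uniform random search over a box; returns the best point and its fitness.

    `fitness` is treated as a black box: only its return values are used.
    """
    rng = random.Random(seed)
    best_x, best_y = None, float("inf")
    for _ in range(max_evaluations):
        # Sample one candidate uniformly inside the box [lower, upper].
        x = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        y = fitness(x)
        if y < best_y:  # keep the best candidate seen so far (minimization)
            best_x, best_y = x, y
    return best_x, best_y


def sphere(x):
    """Classic benchmark function; its minimum is 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_y = random_search(sphere, [-5.0] * 3, [5.0] * 3)
print(best_x, best_y)
```

Population-based optimizers replace the uniform sampling step with a distribution that adapts to past evaluations, but they fit the same evaluate-and-keep-the-best loop.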